Overview

Dataset statistics

Number of variables: 21
Number of observations: 1000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 716.1 KiB
Average record size in memory: 733.3 B
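
The average record size appears to be the total memory footprint divided by the number of observations. A minimal Python sketch of that arithmetic, assuming 1 KiB = 1024 bytes (the function name is illustrative and not part of pandas-profiling):

```python
KIB = 1024  # bytes per kibibyte (assumed unit convention)


def average_record_size(total_kib: float, n_rows: int) -> float:
    """Return the average number of bytes per row for a total size given in KiB."""
    return total_kib * KIB / n_rows


# Figures taken from the dataset statistics above.
avg_bytes = average_record_size(716.1, 1000)
print(f"{avg_bytes:.1f} B")
```

Because the reported total is itself rounded to one decimal place, the result only matches the reported 733.3 B to that precision.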

Variable types

Categorical: 12
Numeric: 8
DateTime: 1

Warnings

gross margin percentage has constant value "4.761904762" [Constant]
Years has constant value "2019" [Constant]
Invoice ID has a high cardinality: 1000 distinct values [High cardinality]
Time has a high cardinality: 506 distinct values [High cardinality]
Unit price is highly correlated with Tax 5% and 3 other fields [High correlation]
Quantity is highly correlated with Tax 5% and 3 other fields [High correlation]
Tax 5% is highly correlated with Unit price and 4 other fields [High correlation]
Total is highly correlated with Unit price and 4 other fields [High correlation]
cogs is highly correlated with Unit price and 4 other fields [High correlation]
gross income is highly correlated with Unit price and 4 other fields [High correlation]
Unit price is highly correlated with Tax 5% and 3 other fields [High correlation]
Quantity is highly correlated with Tax 5% and 3 other fields [High correlation]
Tax 5% is highly correlated with Unit price and 4 other fields [High correlation]
Total is highly correlated with Unit price and 4 other fields [High correlation]
cogs is highly correlated with Unit price and 4 other fields [High correlation]
gross income is highly correlated with Unit price and 4 other fields [High correlation]
Quantity is highly correlated with Tax 5% and 3 other fields [High correlation]
Tax 5% is highly correlated with Quantity and 3 other fields [High correlation]
Total is highly correlated with Quantity and 3 other fields [High correlation]
cogs is highly correlated with Quantity and 3 other fields [High correlation]
gross income is highly correlated with Quantity and 3 other fields [High correlation]
Unit price is highly correlated with cogs and 3 other fields [High correlation]
Day is highly correlated with Date and 1 other field [High correlation]
cogs is highly correlated with Unit price and 4 other fields [High correlation]
City is highly correlated with Branch [High correlation]
Month is highly correlated with Date [High correlation]
Date is highly correlated with Day and 2 other fields [High correlation]
Total is highly correlated with Unit price and 4 other fields [High correlation]
Branch is highly correlated with City [High correlation]
gross income is highly correlated with Unit price and 4 other fields [High correlation]
Weeday is highly correlated with Day and 1 other field [High correlation]
Quantity is highly correlated with cogs and 3 other fields [High correlation]
Tax 5% is highly correlated with Unit price and 4 other fields [High correlation]
Day is highly correlated with gross margin percentage and 1 other field [High correlation]
gross margin percentage is highly correlated with Day and 8 other fields [High correlation]
City is highly correlated with gross margin percentage and 2 other fields [High correlation]
Customer type is highly correlated with gross margin percentage and 1 other field [High correlation]
Years is highly correlated with Day and 8 other fields [High correlation]
Month is highly correlated with gross margin percentage and 1 other field [High correlation]
Product line is highly correlated with gross margin percentage and 1 other field [High correlation]
Gender is highly correlated with gross margin percentage and 1 other field [High correlation]
Payment is highly correlated with gross margin percentage and 1 other field [High correlation]
Branch is highly correlated with gross margin percentage and 2 other fields [High correlation]
Invoice ID is uniformly distributed [Uniform]
Time is uniformly distributed [Uniform]
Invoice ID has unique values [Unique]
Weeday has 125 (12.5%) zeros [Zeros]
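
One plausible reading of the correlation and constant-value warnings is that several monetary columns are linear functions of each other. The sketch below checks that reading against the means reported further down in this report, under the assumption (not stated in the report) that Tax 5% = 0.05 × cogs and Total = cogs + Tax 5%. Agreement of the means is consistent with that assumption but does not prove it row by row.

```python
import math

# Means copied from the per-variable statistics in this report.
mean_cogs = 307.58738
mean_tax = 15.379369           # "Tax 5%"
mean_total = 322.966749
mean_gross_income = 15.379369

# Assumed relationships (hypothetical, inferred from the column names):
#   Tax 5% = 0.05 * cogs
#   Total  = cogs + Tax 5%
tax_matches = math.isclose(mean_tax, 0.05 * mean_cogs, rel_tol=1e-9)
total_matches = math.isclose(mean_total, mean_cogs + mean_tax, rel_tol=1e-9)

# If gross income is a fixed share of Total, the margin is the same for every row.
gross_margin_pct = 100 * mean_gross_income / mean_total

print(tax_matches, total_matches, round(gross_margin_pct, 9))
```

Under these assumptions the margin works out to 100/21, which agrees with the constant 4.761904762 flagged for gross margin percentage.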

Reproduction

Analysis started: 2021-08-05 07:47:45.960101
Analysis finished: 2021-08-05 07:48:01.590175
Duration: 15.63 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json
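
The reported duration can be recomputed from the two timestamps. A short sketch using only the Python standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction panel.
started = datetime.fromisoformat("2021-08-05 07:47:45.960101")
finished = datetime.fromisoformat("2021-08-05 07:48:01.590175")

# Subtracting two datetimes gives a timedelta; total_seconds() keeps the microseconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```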

Variables

Invoice ID
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE

Distinct: 1000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 66.5 KiB

Value                 Count
781-84-8059           1
139-52-2867           1
719-89-8991           1
109-86-4363           1
380-94-4661           1
Other values (995)    995

Length

Max length: 11
Median length: 11
Mean length: 11
Min length: 11

Characters and Unicode

Total characters: 11000
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
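
As a rough illustration of how such character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using the five sample Invoice ID values listed in this section. The helper name `category_counts` is hypothetical, and this is not necessarily how pandas-profiling computes its figures. The standard library exposes general categories only, so scripts and blocks are not covered here.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters by Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            # e.g. "Nd" = Decimal Number, "Pd" = Dash Punctuation
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample rows shown for Invoice ID.
sample = ["750-67-8428", "226-31-3081", "631-41-3108", "123-19-1176", "373-73-7910"]
counts = category_counts(sample)
print(dict(counts))
```

Each sample ID has nine digits and two hyphens, which is the same 9:2 split behind the 81.8% / 18.2% category breakdown reported for the full column.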

Unique

Unique: 1000
Unique (%): 100.0%

Sample

1st row: 750-67-8428
2nd row: 226-31-3081
3rd row: 631-41-3108
4th row: 123-19-1176
5th row: 373-73-7910

Common Values

Value                 Count    Frequency (%)
781-84-8059           1        0.1%
139-52-2867           1        0.1%
719-89-8991           1        0.1%
109-86-4363           1        0.1%
380-94-4661           1        0.1%
787-15-1757           1        0.1%
767-54-1907           1        0.1%
788-07-8452           1        0.1%
325-90-8763           1        0.1%
316-55-4634           1        0.1%
Other values (990)    990      99.0%

Length

Histogram of lengths of the category

Most occurring characters

Value    Count    Frequency (%)
-        2000     18.2%
2        957      8.7%
6        954      8.7%
1        950      8.6%
8        944      8.6%
5        927      8.4%
4        918      8.3%
3        909      8.3%
7        895      8.1%
0        809      7.4%

Most occurring categories

Value               Count    Frequency (%)
Decimal Number      9000     81.8%
Dash Punctuation    2000     18.2%

Most frequent character per category

Decimal Number

Value    Count    Frequency (%)
2        957      10.6%
6        954      10.6%
1        950      10.6%
8        944      10.5%
5        927      10.3%
4        918      10.2%
3        909      10.1%
7        895      9.9%
0        809      9.0%
9        737      8.2%

Dash Punctuation

Value    Count    Frequency (%)
-        2000     100.0%

Most occurring scripts

Value     Count    Frequency (%)
Common    11000    100.0%

Most frequent character per script

Common
Value    Count    Frequency (%)
-        2000     18.2%
2        957      8.7%
6        954      8.7%
1        950      8.6%
8        944      8.6%
5        927      8.4%
4        918      8.3%
3        909      8.3%
7        895      8.1%
0        809      7.4%

Most occurring blocks

Value    Count    Frequency (%)
ASCII    11000    100.0%

Most frequent character per block

ASCII
Value    Count    Frequency (%)
-        2000     18.2%
2        957      8.7%
6        954      8.7%
1        950      8.6%
8        944      8.6%
5        927      8.4%
4        918      8.3%
3        909      8.3%
7        895      8.1%
0        809      7.4%

Branch
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct3
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size56.8 KiB
A
340 
B
332 
C
328 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters1000
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowA
2nd rowC
3rd rowA
4th rowA
5th rowA

Common Values

ValueCountFrequency (%)
A340
34.0%
B332
33.2%
C328
32.8%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
a340
34.0%
b332
33.2%
c328
32.8%

Most occurring characters

ValueCountFrequency (%)
A340
34.0%
B332
33.2%
C328
32.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter1000
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A340
34.0%
B332
33.2%
C328
32.8%

Most occurring scripts

ValueCountFrequency (%)
Latin1000
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A340
34.0%
B332
33.2%
C328
32.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII1000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A340
34.0%
B332
33.2%
C328
32.8%

City
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct3
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size63.3 KiB
Yangon
340 
Mandalay
332 
Naypyitaw
328 

Length

Max length: 9
Median length: 8
Mean length: 7.648
Min length: 6
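
The mean length follows directly from the three city names and their counts. A small sketch of that calculation, using the counts reported for this variable (the variable names are illustrative):

```python
# City counts as reported for this variable.
counts = {"Yangon": 340, "Mandalay": 332, "Naypyitaw": 328}

# Total characters is the sum of (name length * number of rows with that name).
total_chars = sum(len(name) * n for name, n in counts.items())
mean_length = total_chars / sum(counts.values())

print(total_chars, mean_length)
```

The total agrees with the 7648 characters listed under Characters and Unicode for this variable.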

Characters and Unicode

Total characters7648
Distinct characters14
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowYangon
2nd rowNaypyitaw
3rd rowYangon
4th rowYangon
5th rowYangon

Common Values

ValueCountFrequency (%)
Yangon340
34.0%
Mandalay332
33.2%
Naypyitaw328
32.8%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
yangon340
34.0%
mandalay332
33.2%
naypyitaw328
32.8%

Most occurring characters

ValueCountFrequency (%)
a1992
26.0%
n1012
13.2%
y988
12.9%
Y340
 
4.4%
g340
 
4.4%
o340
 
4.4%
M332
 
4.3%
d332
 
4.3%
l332
 
4.3%
N328
 
4.3%
Other values (4)1312
17.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter6648
86.9%
Uppercase Letter1000
 
13.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a1992
30.0%
n1012
15.2%
y988
14.9%
g340
 
5.1%
o340
 
5.1%
d332
 
5.0%
l332
 
5.0%
p328
 
4.9%
i328
 
4.9%
t328
 
4.9%
Uppercase Letter
ValueCountFrequency (%)
Y340
34.0%
M332
33.2%
N328
32.8%

Most occurring scripts

ValueCountFrequency (%)
Latin7648
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a1992
26.0%
n1012
13.2%
y988
12.9%
Y340
 
4.4%
g340
 
4.4%
o340
 
4.4%
M332
 
4.3%
d332
 
4.3%
l332
 
4.3%
N328
 
4.3%
Other values (4)1312
17.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII7648
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a1992
26.0%
n1012
13.2%
y988
12.9%
Y340
 
4.4%
g340
 
4.4%
o340
 
4.4%
M332
 
4.3%
d332
 
4.3%
l332
 
4.3%
N328
 
4.3%
Other values (4)1312
17.2%

Customer type
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size61.6 KiB
Member
501 
Normal
499 

Length

Max length6
Median length6
Mean length6
Min length6

Characters and Unicode

Total characters6000
Distinct characters9
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowMember
2nd rowNormal
3rd rowNormal
4th rowMember
5th rowNormal

Common Values

ValueCountFrequency (%)
Member501
50.1%
Normal499
49.9%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
member501
50.1%
normal499
49.9%

Most occurring characters

ValueCountFrequency (%)
e1002
16.7%
m1000
16.7%
r1000
16.7%
M501
8.3%
b501
8.3%
N499
8.3%
o499
8.3%
a499
8.3%
l499
8.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter5000
83.3%
Uppercase Letter1000
 
16.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e1002
20.0%
m1000
20.0%
r1000
20.0%
b501
10.0%
o499
10.0%
a499
10.0%
l499
10.0%
Uppercase Letter
ValueCountFrequency (%)
M501
50.1%
N499
49.9%

Most occurring scripts

ValueCountFrequency (%)
Latin6000
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e1002
16.7%
m1000
16.7%
r1000
16.7%
M501
8.3%
b501
8.3%
N499
8.3%
o499
8.3%
a499
8.3%
l499
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII6000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e1002
16.7%
m1000
16.7%
r1000
16.7%
M501
8.3%
b501
8.3%
N499
8.3%
o499
8.3%
a499
8.3%
l499
8.3%

Gender
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size60.7 KiB
Female
501 
Male
499 

Length

Max length6
Median length6
Mean length5.002
Min length4

Characters and Unicode

Total characters5002
Distinct characters6
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowFemale
2nd rowFemale
3rd rowMale
4th rowMale
5th rowMale

Common Values

ValueCountFrequency (%)
Female501
50.1%
Male499
49.9%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
female501
50.1%
male499
49.9%

Most occurring characters

ValueCountFrequency (%)
e1501
30.0%
a1000
20.0%
l1000
20.0%
F501
 
10.0%
m501
 
10.0%
M499
 
10.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter4002
80.0%
Uppercase Letter1000
 
20.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e1501
37.5%
a1000
25.0%
l1000
25.0%
m501
 
12.5%
Uppercase Letter
ValueCountFrequency (%)
F501
50.1%
M499
49.9%

Most occurring scripts

ValueCountFrequency (%)
Latin5002
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e1501
30.0%
a1000
20.0%
l1000
20.0%
F501
 
10.0%
m501
 
10.0%
M499
 
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII5002
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e1501
30.0%
a1000
20.0%
l1000
20.0%
F501
 
10.0%
m501
 
10.0%
M499
 
10.0%

Product line
Categorical

HIGH CORRELATION

Distinct6
Distinct (%)0.6%
Missing0
Missing (%)0.0%
Memory size73.9 KiB
Fashion accessories
178 
Food and beverages
174 
Electronic accessories
170 
Sports and travel
166 
Home and lifestyle
160 

Length

Max length22
Median length18
Mean length18.54
Min length17

Characters and Unicode

Total characters18540
Distinct characters25
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowHealth and beauty
2nd rowElectronic accessories
3rd rowHome and lifestyle
4th rowHealth and beauty
5th rowSports and travel

Common Values

ValueCountFrequency (%)
Fashion accessories178
17.8%
Food and beverages174
17.4%
Electronic accessories170
17.0%
Sports and travel166
16.6%
Home and lifestyle160
16.0%
Health and beauty152
15.2%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
and652
24.6%
accessories348
13.1%
fashion178
 
6.7%
food174
 
6.6%
beverages174
 
6.6%
electronic170
 
6.4%
sports166
 
6.3%
travel166
 
6.3%
home160
 
6.0%
lifestyle160
 
6.0%
Other values (2)304
11.5%

Most occurring characters

ValueCountFrequency (%)
e2338
12.6%
a1822
 
9.8%
s1722
 
9.3%
1652
 
8.9%
o1370
 
7.4%
c1036
 
5.6%
r1024
 
5.5%
n1000
 
5.4%
t966
 
5.2%
i856
 
4.6%
Other values (15)4754
25.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter15888
85.7%
Space Separator1652
 
8.9%
Uppercase Letter1000
 
5.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e2338
14.7%
a1822
11.5%
s1722
10.8%
o1370
8.6%
c1036
 
6.5%
r1024
 
6.4%
n1000
 
6.3%
t966
 
6.1%
i856
 
5.4%
d826
 
5.2%
Other values (10)2928
18.4%
Uppercase Letter
ValueCountFrequency (%)
F352
35.2%
H312
31.2%
E170
17.0%
S166
16.6%
Space Separator
ValueCountFrequency (%)
1652
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin16888
91.1%
Common1652
 
8.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e2338
13.8%
a1822
10.8%
s1722
10.2%
o1370
 
8.1%
c1036
 
6.1%
r1024
 
6.1%
n1000
 
5.9%
t966
 
5.7%
i856
 
5.1%
d826
 
4.9%
Other values (14)3928
23.3%
Common
ValueCountFrequency (%)
1652
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII18540
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e2338
12.6%
a1822
 
9.8%
s1722
 
9.3%
1652
 
8.9%
o1370
 
7.4%
c1036
 
5.6%
r1024
 
5.5%
n1000
 
5.4%
t966
 
5.2%
i856
 
4.6%
Other values (15)4754
25.6%

Unit price
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 943
Distinct (%): 94.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 55.67213
Minimum: 10.08
Maximum: 99.96
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.9 KiB

Quantile statistics

Minimum: 10.08
5-th percentile: 15.279
Q1: 32.875
median: 55.23
Q3: 77.935
95-th percentile: 97.222
Maximum: 99.96
Range: 89.88
Interquartile range (IQR): 45.06

Descriptive statistics

Standard deviation: 26.49462835
Coefficient of variation (CV): 0.4759047004
Kurtosis: -1.218591428
Mean: 55.67213
Median Absolute Deviation (MAD): 22.505
Skewness: 0.007077447853
Sum: 55672.13
Variance: 701.9653313
Monotonicity: Not monotonic
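
Several of these figures are derived from others in the same panel. The sketch below recomputes the range, interquartile range, coefficient of variation and variance for Unit price from the reported minimum, maximum, quartiles, mean and standard deviation. It uses the usual textbook definitions, which are assumed (not confirmed by the report) to match what pandas-profiling uses.

```python
import math

# Values copied from the Unit price statistics.
minimum, maximum = 10.08, 99.96
q1, q3 = 32.875, 77.935
mean, std = 55.67213, 26.49462835

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # IQR = Q3 - Q1
cv = std / mean                   # Coefficient of variation = std / mean
variance = std ** 2               # Variance = std squared

print(round(value_range, 2), round(iqr, 2), round(cv, 6), round(variance, 4))
```

The recomputed values agree with the reported ones up to the rounding of the inputs.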
Histogram with fixed size bins (bins=50)

Common Values

Value                 Count    Frequency (%)
83.77                 3        0.3%
65.94                 2        0.2%
34.42                 2        0.2%
98.7                  2        0.2%
39.62                 2        0.2%
37.15                 2        0.2%
99.96                 2        0.2%
36.36                 2        0.2%
42.57                 2        0.2%
60.95                 2        0.2%
Other values (933)    979      97.9%

Smallest values

Value    Count    Frequency (%)
10.08    1        0.1%
10.13    1        0.1%
10.16    1        0.1%
10.17    1        0.1%
10.18    1        0.1%
10.53    1        0.1%
10.56    1        0.1%
10.59    1        0.1%
10.69    1        0.1%
10.75    1        0.1%

Largest values

Value    Count    Frequency (%)
99.96    2        0.2%
99.92    1        0.1%
99.89    1        0.1%
99.83    1        0.1%
99.82    2        0.2%
99.79    1        0.1%
99.78    1        0.1%
99.73    1        0.1%
99.71    1        0.1%
99.7     1        0.1%

Quantity
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct10
Distinct (%)1.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean5.51
Minimum1
Maximum10
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size7.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
median: 5
Q3: 8
95-th percentile: 10
Maximum: 10
Range: 9
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation2.923430595
Coefficient of variation (CV)0.5305681661
Kurtosis-1.215547226
Mean5.51
Median Absolute Deviation (MAD)2
Skewness0.01294104802
Sum5510
Variance8.546446446
MonotonicityNot monotonic
Histogram with fixed size bins (bins=10)

Common Values

Value    Count    Frequency (%)
10       119      11.9%
1        112      11.2%
4        109      10.9%
5        102      10.2%
7        102      10.2%
6        98       9.8%
9        92       9.2%
2        91       9.1%
3        90       9.0%
8        85       8.5%

Smallest values

Value    Count    Frequency (%)
1        112      11.2%
2        91       9.1%
3        90       9.0%
4        109      10.9%
5        102      10.2%
6        98       9.8%
7        102      10.2%
8        85       8.5%
9        92       9.2%
10       119      11.9%

Largest values

Value    Count    Frequency (%)
10       119      11.9%
9        92       9.2%
8        85       8.5%
7        102      10.2%
6        98       9.8%
5        102      10.2%
4        109      10.9%
3        90       9.0%
2        91       9.1%
1        112      11.2%
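
Because Quantity takes only ten distinct values, its summary statistics can be recomputed from the frequency table alone. The sketch below does this with plain Python. It assumes the variance uses the sample (n − 1) convention, which is the pandas default. The variable names are illustrative.

```python
import math

# Quantity value -> count, copied from the frequency table for this variable.
freq = {10: 119, 1: 112, 4: 109, 5: 102, 7: 102, 6: 98, 9: 92, 2: 91, 3: 90, 8: 85}

n = sum(freq.values())                           # number of observations
total = sum(v * c for v, c in freq.items())      # sum of all quantities
mean = total / n

# Sample variance: squared deviations weighted by count, divided by n - 1.
squared_dev = sum(c * (v - mean) ** 2 for v, c in freq.items())
variance = squared_dev / (n - 1)
std = math.sqrt(variance)

print(n, total, mean, round(variance, 6), round(std, 6))
```

The count, sum and mean reproduce the reported 1000, 5510 and 5.51, and the variance and standard deviation agree with the reported values to the printed precision.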

Tax 5%
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct990
Distinct (%)99.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean15.379369
Minimum0.5085
Maximum49.65
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size7.9 KiB

Quantile statistics

Minimum: 0.5085
5-th percentile: 1.955725
Q1: 5.924875
median: 12.088
Q3: 22.44525
95-th percentile: 39.1665
Maximum: 49.65
Range: 49.1415
Interquartile range (IQR): 16.520375

Descriptive statistics

Standard deviation11.70882548
Coefficient of variation (CV)0.7613332823
Kurtosis-0.0818847579
Mean15.379369
Median Absolute Deviation (MAD)7.50875
Skewness0.892569805
Sum15379.369
Variance137.0965941
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value                 Count    Frequency (%)
12.57                 2        0.2%
4.464                 2        0.2%
10.3635               2        0.2%
9.0045                2        0.2%
22.428                2        0.2%
10.326                2        0.2%
8.377                 2        0.2%
39.48                 2        0.2%
4.154                 2        0.2%
13.188                2        0.2%
Other values (980)    980      98.0%

Smallest values

Value     Count    Frequency (%)
0.5085    1        0.1%
0.6045    1        0.1%
0.627     1        0.1%
0.639     1        0.1%
0.699     1        0.1%
0.767     1        0.1%
0.7715    1        0.1%
0.775     1        0.1%
0.814     1        0.1%
0.8875    1        0.1%

Largest values

Value     Count    Frequency (%)
49.65     1        0.1%
49.49     1        0.1%
49.26     1        0.1%
48.75     1        0.1%
48.69     1        0.1%
48.685    1        0.1%
48.605    1        0.1%
47.79     1        0.1%
47.72     1        0.1%
45.325    1        0.1%

Total
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct990
Distinct (%)99.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean322.966749
Minimum10.6785
Maximum1042.65
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size7.9 KiB

Quantile statistics

Minimum: 10.6785
5-th percentile: 41.070225
Q1: 124.422375
median: 253.848
Q3: 471.35025
95-th percentile: 822.4965
Maximum: 1042.65
Range: 1031.9715
Interquartile range (IQR): 346.927875

Descriptive statistics

Standard deviation245.8853351
Coefficient of variation (CV)0.7613332823
Kurtosis-0.0818847579
Mean322.966749
Median Absolute Deviation (MAD)157.68375
Skewness0.892569805
Sum322966.749
Variance60459.59802
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value                 Count    Frequency (%)
93.744                2        0.2%
189.0945              2        0.2%
829.08                2        0.2%
87.234                2        0.2%
470.988               2        0.2%
216.846               2        0.2%
175.917               2        0.2%
217.6335              2        0.2%
263.97                2        0.2%
276.948               2        0.2%
Other values (980)    980      98.0%

Smallest values

Value      Count    Frequency (%)
10.6785    1        0.1%
12.6945    1        0.1%
13.167     1        0.1%
13.419     1        0.1%
14.679     1        0.1%
16.107     1        0.1%
16.2015    1        0.1%
16.275     1        0.1%
17.094     1        0.1%
18.6375    1        0.1%

Largest values

Value       Count    Frequency (%)
1042.65     1        0.1%
1039.29     1        0.1%
1034.46     1        0.1%
1023.75     1        0.1%
1022.49     1        0.1%
1022.385    1        0.1%
1020.705    1        0.1%
1003.59     1        0.1%
1002.12     1        0.1%
951.825     1        0.1%

Date
Date

HIGH CORRELATION

Distinct: 89
Distinct (%): 8.9%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB
Minimum: 2019-01-01 00:00:00
Maximum: 2019-03-30 00:00:00
Histogram with fixed size bins (bins=50)

Time
Categorical

HIGH CARDINALITY
UNIFORM

Distinct506
Distinct (%)50.6%
Missing0
Missing (%)0.0%
Memory size60.7 KiB
14:42
 
7
19:48
 
7
17:38
 
6
19:44
 
5
17:36
 
5
Other values (501)
970 

Length

Max length5
Median length5
Mean length5
Min length5

Characters and Unicode

Total characters5000
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique210 ?
Unique (%)21.0%

Sample

1st row13:08
2nd row10:29
3rd row13:23
4th row20:33
5th row10:37

Common Values

ValueCountFrequency (%)
14:427
 
0.7%
19:487
 
0.7%
17:386
 
0.6%
19:445
 
0.5%
17:365
 
0.5%
10:115
 
0.5%
11:405
 
0.5%
19:305
 
0.5%
17:165
 
0.5%
11:515
 
0.5%
Other values (496)945
94.5%

Length

Histogram of lengths of the category

Most occurring characters

ValueCountFrequency (%)
11250
25.0%
:1000
20.0%
2441
 
8.8%
0437
 
8.7%
3378
 
7.6%
4376
 
7.5%
5354
 
7.1%
8216
 
4.3%
9200
 
4.0%
6184
 
3.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number4000
80.0%
Other Punctuation1000
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
11250
31.2%
2441
 
11.0%
0437
 
10.9%
3378
 
9.4%
4376
 
9.4%
5354
 
8.8%
8216
 
5.4%
9200
 
5.0%
6184
 
4.6%
7164
 
4.1%
Other Punctuation
ValueCountFrequency (%)
:1000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common5000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
11250
25.0%
:1000
20.0%
2441
 
8.8%
0437
 
8.7%
3378
 
7.6%
4376
 
7.5%
5354
 
7.1%
8216
 
4.3%
9200
 
4.0%
6184
 
3.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII5000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
11250
25.0%
:1000
20.0%
2441
 
8.8%
0437
 
8.7%
3378
 
7.6%
4376
 
7.5%
5354
 
7.1%
8216
 
4.3%
9200
 
4.0%
6184
 
3.7%

Payment
Categorical

HIGH CORRELATION

Distinct3
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size62.8 KiB
Ewallet
345 
Cash
344 
Credit card
311 

Length

Max length11
Median length7
Mean length7.212
Min length4

Characters and Unicode

Total characters7212
Distinct characters14
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowEwallet
2nd rowCash
3rd rowCredit card
4th rowEwallet
5th rowEwallet

Common Values

ValueCountFrequency (%)
Ewallet345
34.5%
Cash344
34.4%
Credit card311
31.1%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
ewallet345
26.3%
cash344
26.2%
credit311
23.7%
card311
23.7%

Most occurring characters

ValueCountFrequency (%)
a1000
13.9%
l690
9.6%
e656
9.1%
t656
9.1%
C655
9.1%
r622
8.6%
d622
8.6%
E345
 
4.8%
w345
 
4.8%
s344
 
4.8%
Other values (4)1277
17.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter5901
81.8%
Uppercase Letter1000
 
13.9%
Space Separator311
 
4.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a1000
16.9%
l690
11.7%
e656
11.1%
t656
11.1%
r622
10.5%
d622
10.5%
w345
 
5.8%
s344
 
5.8%
h344
 
5.8%
i311
 
5.3%
Uppercase Letter
ValueCountFrequency (%)
C655
65.5%
E345
34.5%
Space Separator
ValueCountFrequency (%)
311
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin6901
95.7%
Common311
 
4.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a1000
14.5%
l690
10.0%
e656
9.5%
t656
9.5%
C655
9.5%
r622
9.0%
d622
9.0%
E345
 
5.0%
w345
 
5.0%
s344
 
5.0%
Other values (3)966
14.0%
Common
ValueCountFrequency (%)
311
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII7212
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a1000
13.9%
l690
9.6%
e656
9.1%
t656
9.1%
C655
9.1%
r622
8.6%
d622
8.6%
E345
 
4.8%
w345
 
4.8%
s344
 
4.8%
Other values (4)1277
17.7%

cogs
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct990
Distinct (%)99.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean307.58738
Minimum10.17
Maximum993
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size7.9 KiB

Quantile statistics

Minimum: 10.17
5-th percentile: 39.1145
Q1: 118.4975
median: 241.76
Q3: 448.905
95-th percentile: 783.33
Maximum: 993
Range: 982.83
Interquartile range (IQR): 330.4075

Descriptive statistics

Standard deviation234.1765096
Coefficient of variation (CV)0.7613332823
Kurtosis-0.0818847579
Mean307.58738
Median Absolute Deviation (MAD)150.175
Skewness0.892569805
Sum307587.38
Variance54838.63766
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value                 Count    Frequency (%)
448.56                2        0.2%
83.08                 2        0.2%
167.54                2        0.2%
251.4                 2        0.2%
206.52                2        0.2%
263.76                2        0.2%
180.09                2        0.2%
207.27                2        0.2%
89.28                 2        0.2%
789.6                 2        0.2%
Other values (980)    980      98.0%

Smallest values

Value    Count    Frequency (%)
10.17    1        0.1%
12.09    1        0.1%
12.54    1        0.1%
12.78    1        0.1%
13.98    1        0.1%
15.34    1        0.1%
15.43    1        0.1%
15.5     1        0.1%
16.28    1        0.1%
17.75    1        0.1%

Largest values

Value    Count    Frequency (%)
993      1        0.1%
989.8    1        0.1%
985.2    1        0.1%
975      1        0.1%
973.8    1        0.1%
973.7    1        0.1%
972.1    1        0.1%
955.8    1        0.1%
954.4    1        0.1%
906.5    1        0.1%

gross margin percentage
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct1
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size66.5 KiB
4.761904762
1000 

Length

Max length11
Median length11
Mean length11
Min length11

Characters and Unicode

Total characters11000
Distinct characters8
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row4.761904762
2nd row4.761904762
3rd row4.761904762
4th row4.761904762
5th row4.761904762

Common Values

ValueCountFrequency (%)
4.7619047621000
100.0%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
4.7619047621000
100.0%

Most occurring characters

ValueCountFrequency (%)
42000
18.2%
72000
18.2%
62000
18.2%
.1000
9.1%
11000
9.1%
91000
9.1%
01000
9.1%
21000
9.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number10000
90.9%
Other Punctuation1000
 
9.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
42000
20.0%
72000
20.0%
62000
20.0%
11000
10.0%
91000
10.0%
01000
10.0%
21000
10.0%
Other Punctuation
ValueCountFrequency (%)
.1000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common11000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
42000
18.2%
72000
18.2%
62000
18.2%
.1000
9.1%
11000
9.1%
91000
9.1%
01000
9.1%
21000
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII11000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
42000
18.2%
72000
18.2%
62000
18.2%
.1000
9.1%
11000
9.1%
91000
9.1%
01000
9.1%
21000
9.1%

gross income
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct990
Distinct (%)99.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean15.379369
Minimum0.5085
Maximum49.65
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size7.9 KiB

Quantile statistics

Minimum: 0.5085
5-th percentile: 1.955725
Q1: 5.924875
median: 12.088
Q3: 22.44525
95-th percentile: 39.1665
Maximum: 49.65
Range: 49.1415
Interquartile range (IQR): 16.520375

Descriptive statistics

Standard deviation11.70882548
Coefficient of variation (CV)0.7613332823
Kurtosis-0.0818847579
Mean15.379369
Median Absolute Deviation (MAD)7.50875
Skewness0.892569805
Sum15379.369
Variance137.0965941
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value                 Count    Frequency (%)
12.57                 2        0.2%
4.464                 2        0.2%
10.3635               2        0.2%
9.0045                2        0.2%
22.428                2        0.2%
10.326                2        0.2%
8.377                 2        0.2%
39.48                 2        0.2%
4.154                 2        0.2%
13.188                2        0.2%
Other values (980)    980      98.0%

Smallest values

Value     Count    Frequency (%)
0.5085    1        0.1%
0.6045    1        0.1%
0.627     1        0.1%
0.639     1        0.1%
0.699     1        0.1%
0.767     1        0.1%
0.7715    1        0.1%
0.775     1        0.1%
0.814     1        0.1%
0.8875    1        0.1%

Largest values

Value     Count    Frequency (%)
49.65     1        0.1%
49.49     1        0.1%
49.26     1        0.1%
48.75     1        0.1%
48.69     1        0.1%
48.685    1        0.1%
48.605    1        0.1%
47.79     1        0.1%
47.72     1        0.1%
45.325    1        0.1%

Rating
Real number (ℝ≥0)

Distinct61
Distinct (%)6.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean6.9727
Minimum4
Maximum10
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size7.9 KiB

Quantile statistics

Minimum: 4
5-th percentile: 4.295
Q1: 5.5
median: 7
Q3: 8.5
95-th percentile: 9.7
Maximum: 10
Range: 6
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation1.718580294
Coefficient of variation (CV)0.2464727142
Kurtosis-1.151586839
Mean6.9727
Median Absolute Deviation (MAD)1.5
Skewness0.009009648766
Sum6972.7
Variance2.953518228
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value                Count    Frequency (%)
6                    26       2.6%
6.6                  24       2.4%
9.5                  22       2.2%
4.2                  22       2.2%
6.2                  21       2.1%
6.5                  21       2.1%
8                    21       2.1%
5                    21       2.1%
5.1                  21       2.1%
7                    20       2.0%
Other values (51)    781      78.1%

Smallest values

Value    Count    Frequency (%)
4        11       1.1%
4.1      17       1.7%
4.2      22       2.2%
4.3      18       1.8%
4.4      17       1.7%
4.5      17       1.7%
4.6      8        0.8%
4.7      12       1.2%
4.8      13       1.3%
4.9      18       1.8%

Largest values

Value    Count    Frequency (%)
10       5        0.5%
9.9      16       1.6%
9.8      19       1.9%
9.7      14       1.4%
9.6      17       1.7%
9.5      22       2.2%
9.4      12       1.2%
9.3      16       1.6%
9.2      16       1.6%
9.1      14       1.4%

Years
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 59.7 KiB
2019: 1000

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 4000
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2019
2nd row: 2019
3rd row: 2019
4th row: 2019
5th row: 2019

Common Values

Value | Count | Frequency (%)
2019 | 1000 | 100.0%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
2019 | 1000 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
2 | 1000 | 25.0%
0 | 1000 | 25.0%
1 | 1000 | 25.0%
9 | 1000 | 25.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 4000 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 1000 | 25.0%
0 | 1000 | 25.0%
1 | 1000 | 25.0%
9 | 1000 | 25.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 4000 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
2 | 1000 | 25.0%
0 | 1000 | 25.0%
1 | 1000 | 25.0%
9 | 1000 | 25.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 4000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
2 | 1000 | 25.0%
0 | 1000 | 25.0%
1 | 1000 | 25.0%
9 | 1000 | 25.0%

Month
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 62.2 KiB
January: 352
March: 345
February: 303

Length

Max length: 8
Median length: 7
Mean length: 6.613
Min length: 5

Characters and Unicode

Total characters: 6613
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: January
2nd row: March
3rd row: March
4th row: January
5th row: February

Common Values

Value | Count | Frequency (%)
January | 352 | 35.2%
March | 345 | 34.5%
February | 303 | 30.3%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
january | 352 | 35.2%
march | 345 | 34.5%
february | 303 | 30.3%

Most occurring characters

Value | Count | Frequency (%)
a | 1352 | 20.4%
r | 1303 | 19.7%
u | 655 | 9.9%
y | 655 | 9.9%
J | 352 | 5.3%
n | 352 | 5.3%
M | 345 | 5.2%
c | 345 | 5.2%
h | 345 | 5.2%
F | 303 | 4.6%
Other values (2) | 606 | 9.2%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 5613 | 84.9%
Uppercase Letter | 1000 | 15.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 1352 | 24.1%
r | 1303 | 23.2%
u | 655 | 11.7%
y | 655 | 11.7%
n | 352 | 6.3%
c | 345 | 6.1%
h | 345 | 6.1%
e | 303 | 5.4%
b | 303 | 5.4%

Uppercase Letter
Value | Count | Frequency (%)
J | 352 | 35.2%
M | 345 | 34.5%
F | 303 | 30.3%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 6613 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 1352 | 20.4%
r | 1303 | 19.7%
u | 655 | 9.9%
y | 655 | 9.9%
J | 352 | 5.3%
n | 352 | 5.3%
M | 345 | 5.2%
c | 345 | 5.2%
h | 345 | 5.2%
F | 303 | 4.6%
Other values (2) | 606 | 9.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 6613 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 1352 | 20.4%
r | 1303 | 19.7%
u | 655 | 9.9%
y | 655 | 9.9%
J | 352 | 5.3%
n | 352 | 5.3%
M | 345 | 5.2%
c | 345 | 5.2%
h | 345 | 5.2%
F | 303 | 4.6%
Other values (2) | 606 | 9.2%

Day
Categorical

HIGH CORRELATION

Distinct: 7
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 62.8 KiB
Saturday: 164
Tuesday: 158
Wednesday: 143
Friday: 139
Thursday: 138
Other values (2): 258

Length

Max length: 9
Median length: 7
Mean length: 7.191
Min length: 6

Characters and Unicode

Total characters: 7191
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Saturday
2nd row: Friday
3rd row: Sunday
4th row: Sunday
5th row: Friday

Common Values

Value | Count | Frequency (%)
Saturday | 164 | 16.4%
Tuesday | 158 | 15.8%
Wednesday | 143 | 14.3%
Friday | 139 | 13.9%
Thursday | 138 | 13.8%
Sunday | 133 | 13.3%
Monday | 125 | 12.5%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
saturday | 164 | 16.4%
tuesday | 158 | 15.8%
wednesday | 143 | 14.3%
friday | 139 | 13.9%
thursday | 138 | 13.8%
sunday | 133 | 13.3%
monday | 125 | 12.5%

Most occurring characters

Value | Count | Frequency (%)
a | 1164 | 16.2%
d | 1143 | 15.9%
y | 1000 | 13.9%
u | 593 | 8.2%
e | 444 | 6.2%
r | 441 | 6.1%
s | 439 | 6.1%
n | 401 | 5.6%
S | 297 | 4.1%
T | 296 | 4.1%
Other values (7) | 973 | 13.5%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 6191 | 86.1%
Uppercase Letter | 1000 | 13.9%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 1164 | 18.8%
d | 1143 | 18.5%
y | 1000 | 16.2%
u | 593 | 9.6%
e | 444 | 7.2%
r | 441 | 7.1%
s | 439 | 7.1%
n | 401 | 6.5%
t | 164 | 2.6%
i | 139 | 2.2%
Other values (2) | 263 | 4.2%

Uppercase Letter
Value | Count | Frequency (%)
S | 297 | 29.7%
T | 296 | 29.6%
W | 143 | 14.3%
F | 139 | 13.9%
M | 125 | 12.5%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 7191 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 1164 | 16.2%
d | 1143 | 15.9%
y | 1000 | 13.9%
u | 593 | 8.2%
e | 444 | 6.2%
r | 441 | 6.1%
s | 439 | 6.1%
n | 401 | 5.6%
S | 297 | 4.1%
T | 296 | 4.1%
Other values (7) | 973 | 13.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 7191 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 1164 | 16.2%
d | 1143 | 15.9%
y | 1000 | 13.9%
u | 593 | 8.2%
e | 444 | 6.2%
r | 441 | 6.1%
s | 439 | 6.1%
n | 401 | 5.6%
S | 297 | 4.1%
T | 296 | 4.1%
Other values (7) | 973 | 13.5%

Weeday
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 7
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.032
Minimum: 0
Maximum: 6
Zeros: 125
Zeros (%): 12.5%
Negative: 0
Negative (%): 0.0%
Memory size: 7.9 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 1
Median: 3
Q3: 5
95th percentile: 6
Maximum: 6
Range: 6
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 1.973542721
Coefficient of variation (CV): 0.6509045913
Kurtosis: -1.261655451
Mean: 3.032
Median Absolute Deviation (MAD): 2
Skewness: -0.0148188492
Sum: 3032
Variance: 3.894870871
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Value | Count | Frequency (%)
5 | 164 | 16.4%
1 | 158 | 15.8%
2 | 143 | 14.3%
4 | 139 | 13.9%
3 | 138 | 13.8%
6 | 133 | 13.3%
0 | 125 | 12.5%

Value | Count | Frequency (%)
0 | 125 | 12.5%
1 | 158 | 15.8%
2 | 143 | 14.3%
3 | 138 | 13.8%
4 | 139 | 13.9%
5 | 164 | 16.4%
6 | 133 | 13.3%

Value | Count | Frequency (%)
6 | 133 | 13.3%
5 | 164 | 16.4%
4 | 139 | 13.9%
3 | 138 | 13.8%
2 | 143 | 14.3%
1 | 158 | 15.8%
0 | 125 | 12.5%
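
The counts for codes 0 to 6 match the Day counts one for one (0 occurs 125 times like Monday, 5 occurs 164 times like Saturday), which suggests the common Monday=0 through Sunday=6 numbering. The sketch below shows that convention with Python's standard library. How the Weeday column was actually derived is not stated in the report, so the helper names and the derivation are assumptions.

```python
from datetime import date

# Monday=0 ... Sunday=6, the numbering used by date.weekday().
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def weekday_code(iso_date: str) -> int:
    """Return the Monday=0 weekday code for a YYYY-MM-DD date string."""
    return date.fromisoformat(iso_date).weekday()

def day_name(iso_date: str) -> str:
    """Return the English day name for a YYYY-MM-DD date string."""
    return DAY_NAMES[weekday_code(iso_date)]

# Dates taken from the report's sample rows.
for d in ("2019-01-05", "2019-03-08", "2019-03-25"):
    print(d, day_name(d), weekday_code(d))
```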

Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the steepness of the line (its angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
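
A minimal sketch of that calculation in plain Python follows. The function name and the sample values are illustrative and do not come from the profiled dataset. It also checks the invariance property: shifting and rescaling one variable leaves r unchanged, while negating it only flips the sign.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return covariance / (std_x * std_y)

# Illustrative values only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r_xy = pearson_r(x, y)
r_shifted = pearson_r(x, [3.0 * v + 7.0 for v in y])  # change of location and scale
r_flipped = pearson_r(x, [-v for v in y])             # negated variable

print(r_xy, r_shifted, r_flipped)
```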

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
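
As a sketch of that recipe, the code below converts each variable to ranks (tied values share the average of their positions, a common convention assumed here) and then applies the Pearson formula to the ranks. All names and sample values are illustrative.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of equal values that starts at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Pearson's r; the 1/(n-1) factors cancel, so raw sums are used."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# A monotonic but nonlinear relationship (y = x cubed).
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
print(spearman_rho(x, y), pearson_r(x, y))
```

On this example ρ is 1 because the ordering is perfectly preserved, whereas Pearson's r stays below 1 because the relationship is not linear.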

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
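
The pair-counting definition can be sketched directly. This is the plain form described above, with no correction for ties; statistical libraries often apply a tie-corrected variant, so results may differ when ties are present. The function name is illustrative.

```python
from itertools import combinations

def kendall_tau(x, y):
    """(concordant - discordant) / total number of pairs, without tie correction."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1    # both variables move in the same direction
        elif product < 0:
            discordant += 1    # the variables move in opposite directions
        # product == 0 is a tie: it counts toward the total only
    return (concordant - discordant) / len(pairs)

print(kendall_tau([1, 2, 3, 4], [1, 2, 3, 4]))
print(kendall_tau([1, 2, 3], [1, 3, 2]))
```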

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013, which can be found here.
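
A sketch of a bias-corrected Cramér's V for a contingency table of counts follows, using the correction as it is commonly stated for Bergsma's estimator. The function name and the example tables are illustrative, and the report's own implementation may differ in detail.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r = len(table)
    k = len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi squared and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0

independent = [[10, 20], [20, 40]]   # rows are proportional: no association
perfect = [[50, 0], [0, 50]]         # each row maps to exactly one column
print(cramers_v_corrected(independent), cramers_v_corrected(perfect))
```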

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

Index | Invoice ID | Branch | City | Customer type | Gender | Product line | Unit price | Quantity | Tax 5% | Total | Date | Time | Payment | cogs | gross margin percentage | gross income | Rating | Years | Month | Day | Weeday
0 | 750-67-8428 | A | Yangon | Member | Female | Health and beauty | 74.69 | 7 | 26.1415 | 548.9715 | 2019-01-05 | 13:08 | Ewallet | 522.83 | 4.761905 | 26.1415 | 9.1 | 2019 | January | Saturday | 5
1 | 226-31-3081 | C | Naypyitaw | Normal | Female | Electronic accessories | 15.28 | 5 | 3.8200 | 80.2200 | 2019-03-08 | 10:29 | Cash | 76.40 | 4.761905 | 3.8200 | 9.6 | 2019 | March | Friday | 4
2 | 631-41-3108 | A | Yangon | Normal | Male | Home and lifestyle | 46.33 | 7 | 16.2155 | 340.5255 | 2019-03-03 | 13:23 | Credit card | 324.31 | 4.761905 | 16.2155 | 7.4 | 2019 | March | Sunday | 6
3 | 123-19-1176 | A | Yangon | Member | Male | Health and beauty | 58.22 | 8 | 23.2880 | 489.0480 | 2019-01-27 | 20:33 | Ewallet | 465.76 | 4.761905 | 23.2880 | 8.4 | 2019 | January | Sunday | 6
4 | 373-73-7910 | A | Yangon | Normal | Male | Sports and travel | 86.31 | 7 | 30.2085 | 634.3785 | 2019-02-08 | 10:37 | Ewallet | 604.17 | 4.761905 | 30.2085 | 5.3 | 2019 | February | Friday | 4
5 | 699-14-3026 | C | Naypyitaw | Normal | Male | Electronic accessories | 85.39 | 7 | 29.8865 | 627.6165 | 2019-03-25 | 18:30 | Ewallet | 597.73 | 4.761905 | 29.8865 | 4.1 | 2019 | March | Monday | 0
6 | 355-53-5943 | A | Yangon | Member | Female | Electronic accessories | 68.84 | 6 | 20.6520 | 433.6920 | 2019-02-25 | 14:36 | Ewallet | 413.04 | 4.761905 | 20.6520 | 5.8 | 2019 | February | Monday | 0
7 | 315-22-5665 | C | Naypyitaw | Normal | Female | Home and lifestyle | 73.56 | 10 | 36.7800 | 772.3800 | 2019-02-24 | 11:38 | Ewallet | 735.60 | 4.761905 | 36.7800 | 8.0 | 2019 | February | Sunday | 6
8 | 665-32-9167 | A | Yangon | Member | Female | Health and beauty | 36.26 | 2 | 3.6260 | 76.1460 | 2019-01-10 | 17:15 | Credit card | 72.52 | 4.761905 | 3.6260 | 7.2 | 2019 | January | Thursday | 3
9 | 692-92-5582 | B | Mandalay | Member | Female | Food and beverages | 54.84 | 3 | 8.2260 | 172.7460 | 2019-02-20 | 13:27 | Credit card | 164.52 | 4.761905 | 8.2260 | 5.9 | 2019 | February | Wednesday | 2

Last rows

Index | Invoice ID | Branch | City | Customer type | Gender | Product line | Unit price | Quantity | Tax 5% | Total | Date | Time | Payment | cogs | gross margin percentage | gross income | Rating | Years | Month | Day | Weeday
990 | 886-18-2897 | A | Yangon | Normal | Female | Food and beverages | 56.56 | 5 | 14.1400 | 296.9400 | 2019-03-22 | 19:06 | Credit card | 282.80 | 4.761905 | 14.1400 | 4.5 | 2019 | March | Friday | 4
991 | 602-16-6955 | B | Mandalay | Normal | Female | Sports and travel | 76.60 | 10 | 38.3000 | 804.3000 | 2019-01-24 | 18:10 | Ewallet | 766.00 | 4.761905 | 38.3000 | 6.0 | 2019 | January | Thursday | 3
992 | 745-74-0715 | A | Yangon | Normal | Male | Electronic accessories | 58.03 | 2 | 5.8030 | 121.8630 | 2019-03-10 | 20:46 | Ewallet | 116.06 | 4.761905 | 5.8030 | 8.8 | 2019 | March | Sunday | 6
993 | 690-01-6631 | B | Mandalay | Normal | Male | Fashion accessories | 17.49 | 10 | 8.7450 | 183.6450 | 2019-02-22 | 18:35 | Ewallet | 174.90 | 4.761905 | 8.7450 | 6.6 | 2019 | February | Friday | 4
994 | 652-49-6720 | C | Naypyitaw | Member | Female | Electronic accessories | 60.95 | 1 | 3.0475 | 63.9975 | 2019-02-18 | 11:40 | Ewallet | 60.95 | 4.761905 | 3.0475 | 5.9 | 2019 | February | Monday | 0
995 | 233-67-5758 | C | Naypyitaw | Normal | Male | Health and beauty | 40.35 | 1 | 2.0175 | 42.3675 | 2019-01-29 | 13:46 | Ewallet | 40.35 | 4.761905 | 2.0175 | 6.2 | 2019 | January | Tuesday | 1
996 | 303-96-2227 | B | Mandalay | Normal | Female | Home and lifestyle | 97.38 | 10 | 48.6900 | 1022.4900 | 2019-03-02 | 17:16 | Ewallet | 973.80 | 4.761905 | 48.6900 | 4.4 | 2019 | March | Saturday | 5
997 | 727-02-1313 | A | Yangon | Member | Male | Food and beverages | 31.84 | 1 | 1.5920 | 33.4320 | 2019-02-09 | 13:22 | Cash | 31.84 | 4.761905 | 1.5920 | 7.7 | 2019 | February | Saturday | 5
998 | 347-56-2442 | A | Yangon | Normal | Male | Home and lifestyle | 65.82 | 1 | 3.2910 | 69.1110 | 2019-02-22 | 15:33 | Cash | 65.82 | 4.761905 | 3.2910 | 4.1 | 2019 | February | Friday | 4
999 | 849-09-3807 | A | Yangon | Member | Female | Fashion accessories | 88.34 | 7 | 30.9190 | 649.2990 | 2019-02-18 | 13:28 | Cash | 618.38 | 4.761905 | 30.9190 | 6.6 | 2019 | February | Monday | 0
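
The sample rows suggest how the monetary columns relate, which would explain both the constant gross margin percentage and the high correlations flagged in the warnings. The sketch below reproduces the first sample row under those assumed relationships (cogs = unit price × quantity, tax = 5% of cogs, total = cogs + tax, gross income = total - cogs). These formulas are inferred from the data shown, not stated by the report.

```python
# First sample row: Unit price 74.69, Quantity 7.
unit_price = 74.69
quantity = 7
tax_rate = 0.05  # assumed from the column name "Tax 5%"

cogs = unit_price * quantity            # sample row reports 522.83
tax = tax_rate * cogs                   # sample row reports 26.1415
total = cogs + tax                      # sample row reports 548.9715
gross_income = total - cogs             # equals the tax amount
gross_margin_pct = 100 * gross_income / total

print(round(cogs, 2), round(tax, 4), round(total, 4), round(gross_margin_pct, 6))
```

Under these assumptions the margin is 100 × 0.05 / 1.05 = 100 / 21 for every row, regardless of price or quantity, which matches the constant value reported for gross margin percentage.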